@jpsamaroo
Member

No description provided.

@maleadt
Member

maleadt commented Jul 28, 2025

The problem with the added KernelState (which shouldn't be wrapped in a Ref) is that GPUCompiler passes it by value rather than by reference, see https://github.com/JuliaGPU/GPUCompiler.jl/blob/32b4fc87eeece6302dd47cf20e255ee510acfc4a/src/irgen.jl#L498-L504, in order to work around performance regressions introduced by passing it as a byval pointer, see JuliaGPU/CUDA.jl#1167. So if we want to extend KernelState support to back-ends that expect a pointer, we'll have to revert JuliaGPU/GPUCompiler.jl@5eb6098 and make sure that doing so doesn't reintroduce those performance regressions (presumably by not relying on NVPTX's byval handling, but by manually converting the argument to values like we did before). This is a little tricky: the expansion of byval needs optimization to clean up after it, so KernelState argument insertion will have to move earlier in the pipeline. I think this should take a day or so, and I should have some time to look at it sometime in the coming weeks.

@maleadt
Member

maleadt commented Sep 3, 2025

JuliaGPU/GPUCompiler.jl#715 is a possible fix for this.

@christiangnrd
Member

Replaced by #657

@maleadt maleadt deleted the jps/rng branch October 8, 2025 11:17